In this tutorial we will introduce the packages required for geospatial work in R, show how to read in a few spatial data types, demonstrate several ways to calculate landscape metrics, and briefly touch on spatial statistics. It is assumed that you have R and RStudio installed and that you, at a minimum, understand the basic concepts of the R language (e.g. you can work throughR For Cats).
Within in the last several years there has been a lot of effort spent on adding spatial data handling and analysis capability to R. Thanks to the very significant effort of the package authors we now have the foundation for doing GIS entirely within R.
The four packages that provide this foundation are:
Let’s dig in a bit deeper on each of these.
The sp package provides the primary spatial data structures for use in R. Many other packages assume your data is stored as one of the sp data structures. Getting into the details of these is beyond the scope of this workshop, but look at the introduction to sp vignette for more details. That being said, we will be working mostly with SpatialPointsDataFrame and SpatialPolygonsDataFrame. More on that later.
Getting sp added is no different than adding any other package that is on CRAN.
install.packages("sp")
library("sp")
The rgdal package provides tools for reading and writing multiple spatial data formats. It is based on the Geospatial Data Abstraction Library (GDAL) which is a project of the Open Source Geospatial Foundation (OSGeo). The primary use of rgdal is to read various spatial data formats into R and store them as sp objects. In this workshop, we will be only using rgdal to read in shape files, but it has utility far beyond that.
As before, nothing special to get set up with rgdal on windows. Simply:
install.packages("rgdal")
library("rgdal")
Getting set up on Linux or Mac requires more effort (i.e. need to have GDAL installed). As this is for a USEPA audience the windows installs will work for most. Thus, discussion of this is mostly beyond the scope of this workshop.
Although sp and rgdal provide raster data capabilities, they do require that the full raster dataset be read into memory. This can have some performance implications as well as limits the size of datasets you can readily work with. The raster package works around this by working with raster data on the disk. That too has some performance implications, but for the most part, in my opinion anyway, raster makes it easier to work with raster data. It also provides several additional functions for analyzing raster data.
To install, just do:
install.packages("raster")
library("raster")
The last of the core packages for doing GIS in R is rgeos. Like rgdal, rgeos is a project of OSgeo. It is a wrapper around the Geometry Engine Open Source c++ library and provides a suite of tools for conducting vector GIS analyses.
To install
install.packages("rgeos")
library("rgeos")
For Linux and Mac the GEOS library will also need to be available. As with rgdal this is a bit beyond the scope of this tutorial.
Things are changing quickly in the R/spatial analysis world and the most fundamental change is via the sf package. This package aims to replace sp, rgdal, and rgeos. There are a lot of reasons why this is a good thing, but that is a bit beyond the scope of this tutorial. Suffice it to say it should make things faster and simpler!
To install sf:
install.packages("sf")
library("sf")
The first exercise won’t be too thrilling, but we need to make sure everyone has the packages installed.
1.) Install sp and load sp into your library. 2.) Repeat, with sf,rgdal, raster, and rgeos.
Although we won’t be working with external GIS in this tutorial, there are several packages that provide ways to move back and forth from another GIS and R. For your reference, here are some.
spgrass6, but for the latest version of GRASS, GRASS 7.There are many packages for doing various types of spatial analysis in R. For this tutorial we will look at only a few
This package is a large suite of tools for species distribution modelling. As part of that suite, the FRAGSTATS metrics are inlcuded.
This is a huge package and will take care of pretty much all of your point pattern analysis needs.
Also another huge package for spatial analysis. This one is dense, but has all of your autocorrelation/variogram functions!
SDMTools, spatstat, and spdep.Reading in data with sp requires rgdal which supports many different data formats. For this quick tutorial lets see an example of reading in a shapefile in the current working directory.
rand <- readOGR(".","rand")
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "rand"
## with 100 features
## It has 1 fields
plot(rand)
To read in raster datasets we can use the raster packages raster() function. For most raster formats it is simply a matter of telling raster() to file location.
nlcd <- raster("nlcd.tif")
plot(nlcd)
Lastly, reading in data with sf is also pretty straightforward.
sf_rand <- st_read("rand.shp")
## Reading layer `rand' from data source `/home/jhollist/projects/r_landscape_tutorial/rand.shp' using driver `ESRI Shapefile'
## Simple feature collection with 100 features and 1 field
## geometry type: POINT
## dimension: XY
## bbox: xmin: -73.41471 ymin: 41.51605 xmax: -72.57539 ymax: 42.41291
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
One of the benefits of using sf is the speed. In my tests it is about twice as fast. Let’s look at a biggish shape file with 1 million points!
1 million points
#The old way
system.time(readOGR(".","big"))
## OGR data source with driver: ESRI Shapefile
## Source: ".", layer: "big"
## with 1000000 features
## It has 1 fields
## user system elapsed
## 13.164 0.220 13.383
#The sf way
system.time(st_read("big.shp"))
## Reading layer `big' from data source `/home/jhollist/projects/r_landscape_tutorial/big.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1000000 features and 1 field
## geometry type: POINT
## dimension: XY
## bbox: xmin: -71.03768 ymin: 41.05976 xmax: -69.09763 ymax: 43.00856
## epsg (SRID): 4326
## proj4string: +proj=longlat +datum=WGS84 +no_defs
## user system elapsed
## 5.820 0.072 5.894
readOGR(). These are unif.shp and clumped.shp. Name them unif and clumped, respectively.